In a Random Process we know which outcomes could happen, but we don't know which particular outcome will happen.
Notation: P(A) = Probability of event A
Rule: 0 <= P(A) <= 1
Definition:
Frequentist Interpretation: The probability of an outcome is the proportion of times the outcome would occur if we observed the random process an infinite number of times.
Bayesian Interpretation: A Bayesian interprets probability as a subjective degree of belief. This interpretation has become popular in the last twenty years.
Law of Large Numbers: as more observations are collected (more repetitions of the random process), the proportion of occurrences of a particular outcome converges to the probability of that outcome
A coin is memoryless: the probability of heads on the 11th toss is the same as the probability of heads on the 10th toss, or any previous toss: P(heads on the 11th toss) = P(heads on the 10th toss) = 0.5
Gambler's fallacy (also called the "law of averages"): the belief that random processes compensate for whatever happened in the past. They do not; this is a common misunderstanding of the Law of Large Numbers
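The Law of Large Numbers can be illustrated with a quick simulation (a sketch, using a simulated fair coin): the running proportion of heads drifts toward 0.5 as the number of tosses grows.

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Toss a fair coin many times and track the running proportion of heads;
# by the Law of Large Numbers it converges to P(heads) = 0.5.
def running_proportion(n_tosses):
    heads = 0
    proportions = []
    for i in range(1, n_tosses + 1):
        heads += random.random() < 0.5  # one toss: heads with probability 0.5
        proportions.append(heads / i)
    return proportions

props = running_proportion(100_000)
print(props[9], props[99], props[-1])  # proportion after 10, 100, and 100,000 tosses
```

Note that each toss is still independent of the previous ones: a streak of tails does not make heads more likely on the next toss, which is exactly the gambler's fallacy.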
Disjoint Events (Mutually exclusive) cannot happen at the same time.
* A student can't both fail and pass a class
* A single card drawn from a deck cannot be both an ace and a queen
Non-Disjoint Events Can happen at the same time
* A student can get an A in Stats and A in Econ in the same semester
P(A or B) = P(A) + P(B) - P(A and B)
and when the events are disjoint (P(A and B) = 0):
P(A or B) = P(A) + P(B)

A Sample space is the collection of all possible outcomes of a trial
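The general addition rule can be verified with a small worked example, enumerating a deck of cards and taking A = "card is an ace" and B = "card is a heart":

```python
from fractions import Fraction

# Verify P(A or B) = P(A) + P(B) - P(A and B) for one card drawn from a deck:
# A = "card is an ace", B = "card is a heart" (non-disjoint: the ace of hearts).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(r, s) for r in ranks for s in suits]

p_ace = Fraction(sum(r == "A" for r, s in deck), len(deck))                     # 4/52
p_heart = Fraction(sum(s == "hearts" for r, s in deck), len(deck))              # 13/52
p_both = Fraction(sum(r == "A" and s == "hearts" for r, s in deck), len(deck))  # 1/52
p_either = Fraction(sum(r == "A" or s == "hearts" for r, s in deck), len(deck))

assert p_either == p_ace + p_heart - p_both
print(p_either)  # 4/13, i.e. 16/52
```

Without subtracting P(A and B), the ace of hearts would be double-counted.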
A Probability distribution lists all possible outcomes in the sample space, and the probabilities with which they occur
Rules:
* The events listed must be disjoint
* Each probability must be between 0 and 1
* The probabilities must sum to 1
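The last two rules can be checked mechanically (the first rule, disjointness, depends on how the events themselves are defined, so it cannot be read off the numbers alone). A minimal sketch:

```python
import math

# Check rules 2 and 3 for a candidate probability distribution:
# every probability in [0, 1], and the probabilities summing to 1.
def is_valid_distribution(probabilities):
    in_range = all(0 <= p <= 1 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0)  # tolerant of float rounding
    return in_range and sums_to_one

print(is_valid_distribution([0.2, 0.3, 0.5]))  # True
print(is_valid_distribution([0.5, 0.6]))       # False: sums to 1.1
print(is_valid_distribution([1.2, -0.2]))      # False: values out of range
```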
Complementary events are two mutually exclusive (disjoint) events whose probabilities add up to 1
Do the probabilities of two disjoint outcomes always add up to 1?
No, there may be more than two outcomes in the sample space
Two processes are Independent if knowing the outcome of one provides no useful information about the outcome of the other
We use the expression "most likely dependent" because we are dealing with sample data
If we observe a difference between conditional probabilities (based on the sample) –> most likely dependent –> run a hypothesis test to check whether this difference could be due to chance
Important note: random selection implies independence
Observation: we can see a similarity between a die roll and a sample
Die roll: we know the 6 options and their relative frequencies (each option appears exactly once), so if we roll the die many times the observed proportions approach those original relative frequencies
The 6 options and their relative frequencies are like a population that we don't know: by taking a sample from it we can estimate the relative frequencies of the population
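This analogy can be simulated: treat the six faces as the "population" (each with true relative frequency 1/6) and a long run of rolls as the sample.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed for reproducibility

# The six die faces play the role of an unknown population in which each
# outcome has true relative frequency 1/6; a large sample of rolls
# recovers those relative frequencies.
n_rolls = 60_000
rolls = [random.randint(1, 6) for _ in range(n_rolls)]
freqs = {face: count / n_rolls for face, count in sorted(Counter(rolls).items())}
print(freqs)  # each value is close to 1/6 ≈ 0.1667
```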
————————————————————————————————————————————————————————————
Disjointness is about events not being able to happen at the same time, while independence is about processes not affecting each other.
Note: disjoint events with nonzero probability are always dependent on each other, because if we know that one happened, we know that the other one cannot happen
In a contingency table we look at the margins to calculate the marginal probabilities
In a contingency table we look at the intersection of the events of interest to calculate the joint probability. Also, in a Venn diagram we can see the joint probability as the intersection between the circles.
In a contingency table, to calculate a conditional probability we fix a column or row and use only the information recorded in it.
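The three kinds of probabilities can be read off a small contingency table; the counts below are made up purely for illustration.

```python
from fractions import Fraction

# Hypothetical contingency table (counts invented for illustration):
# rows = gender, columns = answer to "do you exercise regularly?"
table = {
    ("male",   "yes"): 30, ("male",   "no"): 20,
    ("female", "yes"): 35, ("female", "no"): 15,
}
total = sum(table.values())  # grand total: 100

# Marginal probability: read off a margin, e.g. P(male)
p_male = Fraction(table[("male", "yes")] + table[("male", "no")], total)

# Joint probability: the intersection of the two events, e.g. P(male and yes)
p_male_and_yes = Fraction(table[("male", "yes")], total)

# Conditional probability: fix the "male" row and use only its counts,
# i.e. P(yes | male) = P(male and yes) / P(male)
p_yes_given_male = p_male_and_yes / p_male

print(p_male, p_male_and_yes, p_yes_given_male)  # 1/2 3/10 3/5
```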
More formally, we have Bayes' Theorem: P(A|B) = P(A and B) / P(B)
With this formula we now have an expression for the joint probability of two dependent events: P(A and B) = P(A|B) × P(B)
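A quick numeric check of this multiplication rule, with illustrative numbers (the 40% and 25% figures are invented for the example):

```python
# Joint probability of dependent events via P(A and B) = P(A|B) * P(B).
# Illustrative numbers: suppose 40% of students take Stats (event B),
# and among Stats students, 25% also take Econ (so P(A|B) = 0.25).
p_b = 0.40          # P(takes Stats)
p_a_given_b = 0.25  # P(takes Econ | takes Stats)

p_a_and_b = p_a_given_b * p_b
print(p_a_and_b)  # 0.1: 10% of all students take both courses
```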
See the slides
With probability trees we can calculate a posterior probability
Posterior probability, P(hypothesis|data): it tells us the probability of a hypothesis we set forth, given the data we just observed
It depends on both the prior probability of the hypothesis and the observed data
This is different from what we calculated at the end of the randomization test on gender discrimination: the probability of the observed or more extreme data given that the null hypothesis is true, i.e. P(data|hypothesis), also called a p-value
In the Bayesian approach, we evaluate claims iteratively as we collect more data.
In other words, we update our prior with our posterior probability from the previous iteration.
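A minimal sketch of this updating loop, with two made-up hypotheses about a coin: H_fair with P(heads) = 0.5 and H_biased with P(heads) = 0.8, starting from equal priors. After each toss, the posterior becomes the prior for the next toss.

```python
# Iterative Bayesian updating with two hypothetical coins:
# H_fair: P(heads) = 0.5, and H_biased: P(heads) = 0.8.
def update(prior_fair, observed_heads):
    like_fair = 0.5                               # fair coin: heads and tails equally likely
    like_biased = 0.8 if observed_heads else 0.2  # biased coin favours heads
    prior_biased = 1 - prior_fair
    # Bayes' Theorem: P(H | data) = P(data | H) * P(H) / P(data)
    numerator = like_fair * prior_fair
    evidence = numerator + like_biased * prior_biased
    return numerator / evidence  # posterior P(H_fair | data)

p_fair = 0.5  # equal priors before seeing any data
for toss in [True, True, True, True]:  # observe four heads in a row
    p_fair = update(p_fair, toss)
    print(round(p_fair, 3))  # belief in the fair coin shrinks with each head
```

Each head is more likely under H_biased than under H_fair, so every update shifts belief toward the biased coin; after four heads the posterior P(H_fair) has dropped from 0.5 to about 0.13.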